Fully automated assessment of the future liver remnant in a blood-free setting via CT before major hepatectomy via deep learning

Objectives To develop and validate a deep learning (DL) model for automated segmentation of hepatic and portal veins, and to apply the model in blood-free future liver remnant (FLR) assessments via CT before major hepatectomy.
Methods 3D U-Net models were developed for the automatic segmentation of hepatic veins and portal veins on contrast-enhanced CT images. A total of 170 patients treated from January 2018 to March 2019 were included. The 3D U-Net models were trained and tested under various liver conditions. The Dice similarity coefficient (DSC) and volumetric similarity (VS) were used to evaluate the segmentation accuracy. Quantitative volumetry for evaluating resection was compared between blood-filled and blood-free settings and between manual and automated segmentation.
Results The DSC values in the test dataset for hepatic veins and portal veins were 0.66 ± 0.08 (95% CI: (0.65, 0.68)) and 0.67 ± 0.07 (95% CI: (0.66, 0.69)), and the VS values were 0.80 ± 0.10 (95% CI: (0.79, 0.84)) and 0.74 ± 0.08 (95% CI: (0.73, 0.76)), respectively. With either manual or automated segmentation, no significant differences in FLR, FLR% assessments, or the percentage of major hepatectomy candidates were noted between the blood-filled and blood-free settings (p = 0.67, 0.59 and 0.99 for manual methods; p = 0.66, 0.99 and 0.99 for automated methods, respectively).
Conclusion DL-based fully automated segmentation of hepatic veins and portal veins and FLR assessment via blood-free CT before major hepatectomy are accurate and applicable in clinical cases.
Critical relevance statement Our fully automatic models could segment hepatic veins, portal veins, and the future liver remnant in a blood-free setting on CT images before major hepatectomy with reliable outcomes.
Key Points
Fully automatic segmentation of hepatic veins and portal veins was feasible in clinical practice.
Fully automatic volumetry of future liver remnant (FLR)% in a blood-free setting was robust.
No significant differences in FLR% assessments were noted between the blood-filled and blood-free settings.


Graphical Abstract
Fully automated segmentation of hepatic and portal veins and FLR assessment via blood-free CT using deep learning before major hepatectomy are accurate and applicable in clinical cases.

Background
Post-hepatectomy liver failure (PHLF) is regarded as the primary factor contributing to mortality after major hepatectomy [1-3]. Its reported prevalence varies and can reach 12-34% [4]. The future liver remnant (FLR) volume is considered one of the most important predictors of PHLF [5], and CT volumetry of the FLR has become an essential procedure before major hepatectomy in clinical practice. However, preoperative CT volumetry has been criticized for under- and overestimating real volumes for several reasons, including (i) the assumed preoperative planes used for volumetry differ from the actual resection planes and (ii) the blood volume contained in the large hepatic vessels (V blood, i.e., the volume of the portal veins and hepatic veins) in the graft contributes to the difference between CT volumetry and real liver grafts, because volumetry on CT images is blood-filled whereas intraoperative measurements are blood-free [4,6,7]. Since the blood pool comprises more than 9% of the whole liver volume [8], V blood should be taken into account. Considering that the minimum FLR in normal and diseased liver tissue (i.e., steatosis, cholestasis, and cirrhosis) ranges from 20% to 40%, it is worthwhile to segment and measure the volume of the hepatic veins and portal veins and to investigate how much the blood content contributes to the error of preoperative CT volumetry. Therefore, segmentation and volumetry of hepatic veins and portal veins on CT images, precise assessment of the preoperative FLR in a blood-free setting, and comparison of the FLR between the blood-filled and blood-free settings (i.e., FLR B-filled and FLR B-free) are essential.
Several studies have reported the use of deep learning (DL) algorithms for volumetry of the right lobe in living donor liver transplantation; however, the difference between the volumetry of FLR B-filled and FLR B-free has not been investigated [9-13]. Moreover, these studies aimed to apply DL algorithms in the preoperative planning of living donor liver transplantation, whereas DL models for preoperative FLR assessment before major hepatectomy have rarely been reported.
Several authors have developed DL models for the automated segmentation of hepatic veins and portal veins, which could potentially be used in preoperative FLR assessment prior to major hepatectomy in a blood-free setting; however, these studies focused on the technical feasibility of developing new DL models to improve segmentation performance, and external validation on various pathologic livers was not performed [14-17]. How these models perform in real clinical cases, especially under highly variable and complex liver conditions, has not been determined. Severe liver deformation caused by cirrhosis and vascular invasion caused by hepatic tumors can make veins smaller and more blurred than in a healthy liver, which makes the segmentation of hepatic vessels a major challenge. Almost all of these studies mentioned that DL models of hepatic veins and portal veins assist in the planning of hepatic resection [14-17]; however, how these DL models perform on pathological livers during real preoperative planning has not been fully evaluated.
Therefore, we aimed to develop a DL model for the automatic segmentation and volumetry of hepatic veins and portal veins, validate the models in an external validation cohort with various liver conditions, and apply the model in combination with preoperative FLR B-filled and FLR B-free assessment prior to major hepatectomy.

Dataset
The training dataset and test dataset had been used in a previous study for the development of a DL model for automated segmentation of Couinaud's liver segments [18].
The training dataset was extracted from 2283 consecutive patients who underwent liver contrast-enhanced CT scans at Medical Center A (Peking University First Hospital) between January 2018 and March 2019. A total of 170 patients were included in the training dataset cohort. A flowchart is presented in Fig. 1. Two test datasets were used for the external validation of the DL models. These test datasets were extracted from 1774 consecutive patients who underwent liver contrast-enhanced CT scans at Medical Center B (Peking University Shenzhen Hospital) between June 2019 and December 2021. A total of 178 patients were included.
To develop a robust DL model, CT data from patients with various liver pathologies and from different CT manufacturers were included. The liver pathologies included fatty liver disease secondary to systemic chemotherapy, alcoholic fatty liver disease, alcohol-associated cirrhosis, nonalcoholic fatty liver disease, and hepatic cirrhosis. Patients with focal nodular hyperplasia, hepatic cysts, hepatic adenoma, hemangioma, or hepatocellular carcinoma were included in test dataset 1. Patients with large hepatic masses (including cholangiocarcinoma, hemangioma, hepatocellular carcinoma, etc.) who were classified as candidates for major hepatectomy were included in test dataset 2. The characteristics of the three datasets are shown in Table 1.

Imaging acquisition
CT images were obtained by five CT scanners from three different manufacturers (summarized in Appendix E1 (supplement)). CT images reconstructed at section thicknesses of 1.25 mm and 1 mm were included in this study.

Imaging processing and labelling
We used ITK-SNAP version 3.8.0 for image processing. The major hepatic veins (i.e., the right hepatic vein (RHV), middle hepatic vein, left hepatic vein, superior RHV and inferior RHV) were annotated up to the second branch ramification. The main portal vein was fully annotated. The left portal vein and right portal vein were annotated up to the second branch of the ramification (shown in Fig. 2).
All the labels were first annotated by a radiologist and subsequently re-evaluated and corrected by another radiologist (with 10 years of experience in liver imaging and 30 years of experience in radiology); these annotations were regarded as the ground truth. All the images from the training and external validation datasets were processed via this procedure.
DL models for the automated segmentation of the entire liver, Couinaud's liver segments and hepatic mass were trained in our institution for a precise preoperative assessment of FLR% B-free; the dataset cohorts and performances are summarized in Appendix E1 (supplement).
Three-dimensional visualization and quantitative assessment of FLRs were then performed. The key steps are demonstrated in Fig. 3.

Model development
The 3D U-Net network described by Çiçek Ö et al [19] was used for the development of DL models of hepatic veins and portal veins. Another three 3D U-Net frameworks were trained for the segmentation of the entire liver, Couinaud's segments, and hepatic mass (Appendix E1 (supplement)). For the development of 3D U-Net models for hepatic and portal veins, 3D contrast-enhanced CT images were input together with manual annotations of all veins and branches with a diameter larger than 2 mm, and the output was the predicted annotation. The development dataset was split into training, validation, and test data at an 8:1:1 ratio. We used the Dice loss as the loss function. The prediction accuracy of the DL models was checked on the validation dataset during the training process. We stopped training when the prediction accuracy started to decrease, to prevent overfitting. The CT images were input at a resolution of 128 × 192 × 256. Image augmentation methods, including translation, affine transformation, and random noise, were adopted. During model training, the Adam gradient descent optimization algorithm was adopted, with a batch size of 2, an initial learning rate of 0.0001 and 400 epochs. The software used was Python 3.6, PyTorch 0.4.1, NumPy, OpenCV and SimpleITK, and the hardware used for model training was an NVIDIA Tesla P100 16 GB GPU.
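The Dice loss and the stopping rule described above can be illustrated with a minimal, framework-free sketch. This is not the authors' PyTorch code: the function names `soft_dice_loss` and `should_stop`, the smoothing constant `eps`, and the `patience` parameter are assumptions introduced for illustration, and the masks are treated as flattened voxel sequences.

```python
from typing import Sequence


def soft_dice_loss(pred: Sequence[float], target: Sequence[int],
                   eps: float = 1e-6) -> float:
    """Dice loss = 1 - 2*sum(p*g) / (sum(p) + sum(g)) over flattened voxels.

    `pred` holds predicted foreground probabilities and `target` the binary
    ground truth. `eps` (an assumed smoothing term) avoids division by zero
    when both masks are empty.
    """
    intersection = sum(p * g for p, g in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def should_stop(val_scores: Sequence[float], patience: int = 1) -> bool:
    """Return True once the validation accuracy has failed to improve on its
    best earlier value for `patience` consecutive epochs (assumed rule)."""
    if len(val_scores) <= patience:
        return False
    best_before = max(val_scores[:-patience])
    return all(score <= best_before for score in val_scores[-patience:])
```

In a training loop, `should_stop` would be called once per epoch with the history of validation scores, and the weights from the best earlier epoch would be kept.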
Model evaluation and qualitative assessment
a. Classification of trunks and branches of hepatic and portal veins. The automated results were regarded as accurate when 3/4 of the length of the main portal veins and hepatic veins and 1/2 of the length of the primary branches were accurately and continuously annotated.
b. Segmentation of hepatic and portal veins. We used the Dice similarity coefficient (DSC) and volumetric similarity (VS) to evaluate the segmentation performance. The DSC was calculated as the voxel overlap between the ground truth (G) and the prediction masks (p). The VS measures the volumetric difference between the ground truth and prediction masks (i.e., V G and V p).
c. Calculation of the FLR%. The total liver volume (TLV), FLR and hepatic lesion volume (V Lesion) were measured on CT images. The ratio of FLR to the nontumor-bearing liver volume was defined as FLR% [20]. FLR% values were calculated in the blood-free (FLR% B-free) and blood-filled (FLR% B-filled) settings for each patient in test dataset 2.
d. Prediction of resection based on FLR% B-free and FLR% B-filled. The optimal minimal FLR% varies for different liver pathologies. In this study, patients with FLR% values larger than 20%, 30%, and 40% for healthy livers, hepatic steatosis, and cirrhosis, respectively, were classified as candidates for major hepatectomy [21-23].
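The FLR% calculation and the resection rule above can be sketched as follows. The blood-filled formula follows the definition in the text (FLR divided by the nontumor-bearing liver volume). How exactly the vessel volumes enter the blood-free calculation is not spelled out in this section, so subtracting them from both the FLR and the nontumor-bearing liver is an assumption, as are all class, field and function names.

```python
from dataclasses import dataclass

# Minimum FLR% thresholds used in this study, by liver condition.
MIN_FLR_PERCENT = {"healthy": 20.0, "steatosis": 30.0, "cirrhosis": 40.0}


@dataclass
class LiverVolumes:
    tlv_ml: float            # total liver volume (TLV), blood-filled
    flr_ml: float            # future liver remnant, blood-filled
    lesion_ml: float         # hepatic lesion volume (V Lesion)
    vessels_total_ml: float  # hepatic + portal vein volume in the whole liver
    vessels_flr_ml: float    # hepatic + portal vein volume inside the FLR


def flr_percent_filled(v: LiverVolumes) -> float:
    """FLR% B-filled: FLR over nontumor-bearing liver volume, in percent."""
    return 100.0 * v.flr_ml / (v.tlv_ml - v.lesion_ml)


def flr_percent_free(v: LiverVolumes) -> float:
    """FLR% B-free. ASSUMPTION: vessel volume is removed from both the FLR
    and the nontumor-bearing liver volume before taking the ratio."""
    flr_free = v.flr_ml - v.vessels_flr_ml
    liver_free = v.tlv_ml - v.lesion_ml - v.vessels_total_ml
    return 100.0 * flr_free / liver_free


def is_candidate(flr_percent: float, condition: str) -> bool:
    """Candidate for major hepatectomy if FLR% is larger than the threshold."""
    return flr_percent > MIN_FLR_PERCENT[condition]
```

Because the vessel volume is removed from both the numerator and the denominator, the blood-free FLR% can end up slightly above or below the blood-filled value even though the blood-free FLR volume is always smaller.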

Statistical analysis
For the DL models of hepatic and portal veins, the automated and manually labelled classification results were compared to evaluate the accuracy of trunk and branch classification. To assess the accuracy of the segmentation and volumetry, the DSC and VS values between the automated and manual segmentation methods were compared. To evaluate the ability of the DL models to assess the preoperative FLR, the differences in FLR% between automated and manual segmentation and between blood-filled and blood-free settings were compared via Bland-Altman analysis. The differences in the prediction of resection between the model and human doctors and between blood-filled and blood-free settings were compared using McNemar's test.
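For reference, the two segmentation metrics compared here can be written as short functions. The formulas below are the commonly used definitions, DSC = 2|G ∩ p| / (|G| + |p|) and VS = 1 − |V G − V p| / (V G + V p); they are assumed to match the ones used in this study, whose equations are not reproduced in this section. The function names are hypothetical, and masks are passed as flattened binary sequences.

```python
from typing import Sequence


def dice_similarity(ground_truth: Sequence[int], prediction: Sequence[int]) -> float:
    """DSC = 2|G ∩ p| / (|G| + |p|) for two flattened binary masks."""
    if len(ground_truth) != len(prediction):
        raise ValueError("masks must have the same number of voxels")
    g = [bool(x) for x in ground_truth]
    p = [bool(x) for x in prediction]
    overlap = sum(1 for a, b in zip(g, p) if a and b)
    total = sum(g) + sum(p)
    # Two empty masks are treated as a perfect match (a convention, assumed).
    return 2.0 * overlap / total if total else 1.0


def volumetric_similarity(v_g: float, v_p: float) -> float:
    """VS = 1 - |V_G - V_p| / (V_G + V_p) for two segmented volumes."""
    total = v_g + v_p
    return 1.0 - abs(v_g - v_p) / total if total else 1.0
```

The VS depends only on the two volumes and not on where the voxels lie, which is consistent with the VS values in this study being higher than the DSC values for the same masks.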

Classification accuracy of trunks and branches of hepatic and portal veins
The classification accuracy for the trunks and branches of the hepatic and portal veins in test dataset 1 + 2 is reported in Table 2.

The accuracy of the segmentation of hepatic and portal veins in test dataset 1 + 2
The average DSCs for the segmentation of hepatic veins and portal veins were 0.66 ± 0.08 (95% CI: (0.65, 0.68)) and 0.67 ± 0.07 (95% CI: (0.66, 0.69)), respectively, and the average VS values were 0.80 ± 0.10 (95% CI: (0.79, 0.84)) and 0.74 ± 0.08 (95% CI: (0.73, 0.76)), respectively. According to the DSC results, the differences in the segmentation of hepatic veins between healthy livers and cirrhotic livers, healthy livers and candidates for major hepatectomy, fatty livers and cirrhotic livers, and fatty livers and candidates for major hepatectomy were statistically significant (p < 0.0001), but no significant differences in portal vein segmentation were found among the subgroups (p = 0.689). For the VS results, no significant differences in segmenting hepatic veins or portal veins were found among the subgroups (p = 0.749 for hepatic veins, p = 0.932 for portal veins) (Fig. 4). For segmentation performance, our results were compared with those of similar studies (Table 3). We validated our models using the largest test dataset, which included the most liver conditions. For the segmentation of hepatic veins, we obtained results similar to those of Tong [15], Tong [17] and Oh [24], with differences of less than 0.3 in DSC. For the segmentation of portal veins, we obtained the highest DSC values.

Volumetric accuracy of large blood vessels in test dataset 1 + 2
The average volumes of the hepatic veins and portal veins obtained by automated and manual segmentation are shown in Table 4. Volumetry of the hepatic veins obtained by automated methods underestimated the manual results in both test datasets (bias was −12.93 mL and −8.67 mL, p < 0.05; 95% limits of agreement (LoA) were −27.25 to 1.40 mL and −29.62 to 12.29 mL in test dataset 1 and test dataset 2, respectively). Volumetry of the portal veins obtained by automated methods also underestimated the manual results in both test datasets (bias was −19.69 mL and −14.36 mL, p < 0.05; 95% LoA were −39.53 to 0.15 mL and −28.42 to −0.30 mL in test dataset 1 and test dataset 2, respectively) (Fig. 5).
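The bias and 95% limits of agreement reported above follow the standard Bland-Altman construction: the bias is the mean of the paired differences, and the limits are the bias ± 1.96 standard deviations of those differences. A minimal sketch using only the Python standard library is given below; the function name and the automated-minus-manual sign convention are assumptions chosen so that a negative bias means the automated method underestimates.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman(automated: Sequence[float],
                 manual: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower LoA, upper LoA) for paired volume measurements.

    Differences are taken as automated - manual, so a negative bias means
    the automated volumetry underestimates the manual volumetry.
    """
    if len(automated) != len(manual) or len(automated) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - m for a, m in zip(automated, manual)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

An upper limit of agreement above zero, as for the hepatic veins here, means that the automated volume can exceed the manual volume in some individual patients even though the average bias is negative.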

Volumetric accuracy of FLR and FLR% in test dataset 2
The FLR and FLR% assessments are shown in Fig. 6. No significant differences in FLR B-free or FLR B-filled values were noted between the manual and automated methods using the Mann-Whitney U test (p = 0.67 and 0.66, respectively) (Fig. 6). No significant differences in FLR% B-free or FLR% B-filled values were noted between the manual and automated methods using the Mann-Whitney U test (p = 0.59 and 0.99, respectively).
In the blood-filled setting, FLR B-filled and FLR% B-filled ranged from 309.87 to 1277.00 mL (mean volume, 725.99 ± 253.09 mL) and from 32.82% to 89.67% (mean value, 59.19% ± 16.56%), respectively. In the blood-free setting, FLR B-free and FLR% B-free ranged from 294.63 to 1256.10 mL (mean volume, 703.89 ± 251.06 mL) and from 32.77% to 91.87% (mean value, 60.61% ± 17.43%), respectively (Fig. 6). FLR assessments obtained in the blood-free setting slightly underestimated FLR B-filled with both manual and automated methods (bias = −22.1 mL and −23.04 mL, p < 0.01; 95% LoA = −35.80 to −8.40 mL and −37.32 to −8.78 mL, respectively); FLR% assessments obtained in the blood-free setting slightly overestimated FLR% B-filled with both manual and automated methods (bias = 1.42% and 0.05%, p < 0.05; 95% LoA = −0.71% to 3.55% and −1.59% to 1.68%, respectively) (Fig. 6).

Fig. 4 Box and whisker plots of the DSC and VS values in the subgroups of healthy liver, fatty liver, hepatic cirrhosis and candidates for major hepatectomy. For the segmentation of hepatic veins, the median DSC values ranged from 0.62 to 0.70, and the differences between subgroups were statistically significant (all p < 0.0001); no significant differences in the segmentation of portal veins were found among groups (p = 0.689). For the VS results, the median values ranged from 0.75 to 0.86, and no significant differences among subgroups were found in segmenting hepatic veins or portal veins (p = 0.749 for hepatic veins, p = 0.932 for portal veins)
For the volumetry of the FLR, we compared our results with those of similar studies via volumetry of the right lobe (shown in Appendix E2 (supplement)). We obtained results similar to those of Kim's [8] and Kalshabay's [10] methods, with differences of less than 45 mL (accounting for 5.59% of the TLV). However, our results were quite different from those of Gündoğdu's [25] and Park's [9] studies, in which the difference was more than 450 mL (accounting for 55.90% of the TLV). The differences in the volumes of the right lobe may be related to the differences in the study population and in the calculation methods used for the FLR in these studies. Patients enrolled in Gündoğdu's [25] and Park's [9] studies were living liver donors with no hepatic disease or mild fatty liver disease, a population quite different from ours.

Comparison of the prediction of resection in test dataset 2
A total of 10 patients, 21 patients and 1 patient underwent complete left hepatectomy, complete right hepatectomy and extended right hepatectomy, respectively. A total of 128 (32 × 2 × 2) FLR% measurements were obtained and compared. The number of patients categorized as candidates for resection is shown in Table 5. All patients were classified as candidates for major hepatectomy by both manual and automated segmentation, based on either the FLR% B-free or the FLR% B-filled assessment. No significant differences in the prediction of resection were found between the human doctors and the automatic segmentation model (p > 0.99) or between the FLR% B-free assessment and the FLR% B-filled assessment (p > 0.99) according to McNemar's test.
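McNemar's test compares two paired binary classifications by looking only at the discordant pairs, that is, patients classified as candidates by one method but not by the other. The sketch below implements the exact (binomial) two-sided variant with the standard library; whether the study used the exact or the chi-square variant is not stated, so this choice, like the function name, is an assumption.

```python
from math import comb
from typing import Sequence


def mcnemar_exact(method_a: Sequence[bool], method_b: Sequence[bool]) -> float:
    """Two-sided exact McNemar p-value for paired binary decisions.

    b = pairs where only method A says "candidate",
    c = pairs where only method B says "candidate".
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    if len(method_a) != len(method_b):
        raise ValueError("the two methods must classify the same patients")
    b = sum(1 for x, y in zip(method_a, method_b) if x and not y)
    c = sum(1 for x, y in zip(method_a, method_b) if y and not x)
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: the methods agree on every patient
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

When every patient is classified identically by both methods, as in Table 5, there are no discordant pairs and the function returns 1.0, which agrees with the reported p > 0.99.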

Discussion
Preoperative CT volumetry of the FLR has been criticized for under- and overestimating the real FLR, mainly because volumetry on CT images is blood-filled whereas intraoperative volumetry is blood-free [4,6,7]. The larger the volume of the blood vessels, the greater the difference between the measured FLR and the real FLR. Blood vessels account for 9% of the total liver volume [8], a proportion that could change the prediction of resection based on FLR%, because the minimum FLR% that must be preserved before major hepatectomy ranges from 20% to 40%. Therefore, a precise FLR calculation in a blood-free setting and a comparison between FLR B-filled and FLR B-free are essential. In this study, we developed and validated DL models for the automatic segmentation of hepatic veins and portal veins and applied this technique for presurgical FLR% assessment in both a blood-filled setting and a blood-free setting prior to major hepatectomy. The key contributions of this study were that preoperative FLR% assessments and predictions of resection in blood-filled and blood-free settings were fully compared with quantitative and qualitative results, multiple types of hepatectomy were included, and the validation dataset included patients with various liver conditions in clinical practice and candidates who underwent major hepatectomy.
The current results indicated that our model allowed fully automatic segmentation of hepatic veins and portal veins and fully automatic volumetry of FLR% in a blood-free setting and was robust for different pathological livers, even in a spatial external validation dataset.
Our models obtained slightly higher DSC values than those of Zbinden et al [26] and Oh et al [24] (Table 3). For the segmentation of hepatic veins, our models obtained DSC values similar to those of Tong et al [15] and Tong et al [17]. We obtained higher DSC values in patients with healthy livers and fatty livers than in patients with hepatic cirrhosis and large liver tumors (shown in Fig. 4). This was primarily because the hepatic veins in patients with cirrhosis and liver tumors were smaller, more blurred and more difficult to distinguish than those in patients with healthy livers, which made the hepatic veins harder to segment.
For the calculations of both the FLR and FLR%, similar studies have focused on the correlations between FLR weight and remnant liver weight [8,27]; however, the differences between FLR B-filled and FLR B-free weight have rarely been analyzed. Our study demonstrated that automated preoperative assessment of FLR B-free and FLR% B-free is feasible, and both results could be used in the prediction of major hepatectomy.
Limitations in our study should be noted. First, for the external validation of the preoperative FLR B-free assessment, the validation would be stronger if the volume of the actual liver remnant after hepatectomy had been obtained and regarded as a reference. Second, validation on unseen pathologies was lacking, and the FLR was validated in 32 patients who underwent three types of major hepatectomy. Further validation in a larger dataset involving unseen pathologies and more types of major hepatectomy is needed.
In conclusion, fully automated preoperative assessments of FLRs in blood-free settings are feasible prior to major hepatectomy, even for different types of resection and various liver conditions. Compared to those of human doctors, the DL models demonstrated similar performance in the final prediction of resection in a spatial external validation dataset.

Fig. 6 Box and whisker plots show preoperative FLR B-free and FLR B-filled (A) and FLR% B-free and FLR% B-filled (B) in candidates for major hepatectomy, obtained by using manual and automated methods. The central boxes represent the values from the 25th to the 75th percentile, and the middle lines in the central boxes represent the medians. The vertical lines below and above the boxes extend from the minimum values to the maximum values. Bland-Altman plots for agreement between FLR B-free and FLR B-filled by using the manual (C) and automated (D) methods; Bland-Altman plots for agreement between FLR% B-free and FLR% B-filled by using the manual (E) and automated (F) methods

                          FLR% B-free    FLR% B-filled
Manual segmentation       32             32
Automated segmentation    32             32
FLR% B-free: the ratio of future liver remnant to total liver volume measured in the blood-free setting; FLR% B-filled: the ratio of future liver remnant to total liver volume measured in the blood-filled setting

Fig. 1 Flowchart showing the inclusion criteria, exclusion criteria, and distribution of computed tomography (CT) scans in the datasets used in this study. TACE, transcatheter arterial chemo-embolization

Fig. 3 Key steps in 3D visualization and quantitative future liver remnant (FLR) assessment for preoperative planning

Fig. 5 Bland-Altman plots for agreement between the manual and automated methods in volumetry of hepatic veins (A, D, E, F, G), portal veins (B, H, I, J, K) and large hepatic veins (C). The segmentation models slightly underestimated manual segmentations in healthy liver, fatty liver, hepatic cirrhosis and candidates for major hepatectomy

Table 2
Accuracy of the deep learning model in the classification of trunks and branches of hepatic and portal veins in test dataset 1 + 2 (%, 95% confidence intervals)
RHV right hepatic vein, MHV middle hepatic vein, LHV left hepatic vein, SRHV superior right hepatic vein, IRHV inferior right hepatic vein, MPV main portal vein, LPV left portal vein, RPV right portal vein

Table 3
Segmentation performance of hepatic vein (HV) and portal vein (PV) compared with literature

Table 5
Number of cases categorized as candidates for major hepatectomy